279 research outputs found

Probabilistic Graphical Model Representation in Phylogenetics

Author: Boussau Bastien
Heath Tracy A.
Huelsenbeck John P.
Höhna Sebastian
Landis Michael J.
Ronquist Fredrik
Publication date: 09/12/2013

Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (1) reproducibility of an analysis, (2) model development, and (3) software design. Moreover, a unified, clear, and understandable framework for model representation lowers the barrier for beginners and non-specialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well suited for teaching statistical models, facilitating communication among phylogeneticists, and developing generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.

arXiv.org e-Print Archive

KU ScholarWorks

PubMed Central

RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.

Author: Boussau Bastien
Heath Tracy
Huelsenbeck John
Höhna Sebastian
Landis Michael
Lartillot Nicolas
Moore Brian
Ronquist Fredrik
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]

Digital Repository @ Iowa State University (ISU)

INRIA a CCSD electronic archive server

PubMed Central

HAL Descartes

eScholarship - University of California

A Bayesian framework for the analysis of cospeciation.

Author: Bret Larget
Bruce Rannala
John P Huelsenbeck
Publication date: 01/01/2000

Information on the history of cospeciation and host switching for a group of host and parasite species is contained in the DNA sequences sampled from each. Here, we develop a Bayesian framework for the analysis of cospeciation. We suggest a simple model of host switching by a parasite on a host phylogeny in which host switching events are assumed to occur at a constant rate over the entire evolutionary history of associated hosts and parasites. The posterior probability density of the parameters of the model of host switching is evaluated numerically using Markov chain Monte Carlo. In particular, the method generates the probability density of the number of host switches and of the host switching rate. Moreover, the method provides information on the probability that an event of host switching is associated with a particular pair of branches. A Bayesian approach has several advantages over other methods for the analysis of cospeciation. In particular, it does not assume that the host or parasite phylogenies are known without error; many alternative phylogenies are sampled in proportion to their probability of being correct.

CiteSeerX

MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space

Author: Aaron Darling
Bret Larget
Daniel L. Ayres
Fredrik Ronquist
John P. Huelsenbeck
Liang Liu
Marc A. Suchard
Maxim Teslenko
Paul van der Mark
Sebastian Höhna
Publication venue: Oxford University Press
Publication date: 01/05/2012

Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.

Crossref

OPUS - University of Technology Sydney

PubMed Central

eScholarship - University of California

BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

Author: Ayres Daniel L.
Beerli Peter
Cummings Michael P.
Darling Aaron
Holder Mark T.
Huelsenbeck John P.
Lewis Paul O.
Rambaut Andrew
Ronquist Fredrik
Swofford David L.
Zwickl Derrick J.
Publication venue: Oxford University Press (OUP)
Publication date: 15/04/2014

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.

This work was supported by the National Science Foundation [grant numbers DBI-0755048, DEB-0732920, DEB-1036448, DMS-0931642, EF-0331495, EF-0905606, EF-0949453]; the National Institutes of Health [grant numbers R01-HG006139, R01-GM037841, R01-GM078985, R01-GM086887, R01-NS063897]; the Biotechnology and Biological Sciences Research Council [grant number BB/H011285/1]; the Wellcome Trust [grant number WT092807MA]; and Google Summer of Code.

KU ScholarWorks

BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

Author: Aaron Darling
Andrew Rambaut
Daniel L. Ayres
David L. Swofford
Derrick J. Zwickl
Fredrik Ronquist
John P. Huelsenbeck
Marc A. Suchard
Mark T. Holder
Michael P. Cummings
Paul O. Lewis
Peter Beerli
Publication venue: Oxford University Press
Publication date: 01/10/2011

CiteSeerX

Crossref

OPUS - University of Technology Sydney

KU ScholarWorks

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

The Arthrobacter Species FB24 Arth_1007 (DnaB) Intein Is a Pseudogene

Author: A Romanelli
CJ Noren
FB Perler
Francine B. Perler
H Paulus
John R. Battista
JP Huelsenbeck
JZ Dalgaard
K Tori
Kazuo Tori
KV Mills
LE Brace
M Kawasaki
MW Southworth
S Pietrokovski
VM Markowitz
Publication venue: Public Library of Science
Publication date: 01/01/2011

An Arthrobacter species FB24 gene (locus tag Arth_1007) was previously annotated as a putative intein-containing DnaB helicase of phage origin (Arsp-FB24 DnaB intein). However, it is not a helicase gene because the sequence similarity is limited to inteins. In fact, the flanking exteins total only 66 amino acids. Therefore, the intein should be referred to as the Arsp-FB24 Arth_1007 intein. The Arsp-FB24 Arth_1007 intein failed to splice in its native precursor and in a model precursor. We previously noted that the Arsp-FB24 Arth_1007 intein is the only putative Class 3 intein that is missing the catalytically essential Cys at position 4 of intein Motif F, which is one of the three defining signature residues of this class. Additionally, a catalytically essential His in position 10 of intein Motif B is also absent; this His is the most conserved residue amongst all inteins. Splicing activity was not rescued when these two catalytically important positions were ‘reverted’ to their consensus residues. This study restores the unity of the Class 3 intein signature sequence in active inteins by demonstrating that the Arsp-FB24 Arth_1007 intein is an inactive pseudogene.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central